Information theoretic limits of learning a sparse rule
We consider generalized linear models in regimes where the number of nonzero components of the signal and the number of accessible data points are sublinear with respect to the size of the signal. We prove a variational formula for the asymptotic mutual information per sample when the system size grows to infinity. This result allows us to derive an expression for the minimum mean-square error (MMSE) of the Bayesian estimator when the signal entries have a discrete distribution with finite support. We find that, for such signals and suitable vanishing scalings of the sparsity and sampling rate, the MMSE is a nonincreasing, piecewise-constant function of the sampling rate. In specific instances the MMSE even displays an all-or-nothing phase transition; that is, the MMSE jumps sharply from its maximum value to zero at a critical sampling rate. The all-or-nothing phenomenon has previously been shown to occur in high-dimensional linear regression. Our analysis goes beyond the linear case and applies to learning the weights of a perceptron with general activation function in a teacher-student scenario. In particular, we discuss an all-or-nothing phenomenon for the generalization error with a sublinear set of training examples.
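The classical route from a mutual-information formula to an MMSE expression is the I-MMSE identity of Guo, Shamai, and Verdú; a minimal statement for a Gaussian channel is sketched below (whether the paper uses exactly this form is an assumption, and the symbols S, Z, and γ are our notation, not the paper's):

    \frac{d}{d\gamma}\, I\big(S;\, \sqrt{\gamma}\, S + Z\big) \;=\; \frac{1}{2}\,\mathrm{MMSE}(\gamma),
    \qquad
    \mathrm{MMSE}(\gamma) \;=\; \mathbb{E}\,\big\| S - \mathbb{E}\big[S \,\big|\, \sqrt{\gamma}\, S + Z\big] \big\|^{2},

where Z is standard Gaussian noise independent of the signal S. Differentiating an asymptotic mutual-information formula with respect to an effective signal-to-noise ratio γ is the standard mechanism by which such MMSE expressions are extracted.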
Review for NeurIPS paper: Information theoretic limits of learning a sparse rule
Summary and Contributions: The paper considers the generalized linear model in the high-dimensional sparse setting. In this setting the authors prove the existence of an 'all-or-nothing phenomenon' (see GZ, RXZ) wherein there exists a function m_*(n) (sublinear in the ambient dimension n) such that
- if the number of samples m ≥ (1+ε) m_*, the error in estimating the signal vanishes; and
- if m ≤ (1−ε) m_*, the error converges to the 'trivial' error of estimating from the prior.
Classical statistical or learning theory typically demonstrates a 'graceful' decay of error as the number of samples grows large. The current work is in a line of work demonstrating a 'sharp cutoff' phenomenon instead, initiated by GZ, RXZ in the linear regression setting.
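In symbols, the cutoff summarized above reads as follows (a restatement of the reviewer's description; err(m) for the estimation error and err_prior for the trivial prior-only error are our notation):

    \lim_{n \to \infty} \mathrm{err}(m) \;=\;
    \begin{cases}
      \mathrm{err}_{\mathrm{prior}} & \text{if } m \le (1-\epsilon)\, m_*(n), \\[2pt]
      0 & \text{if } m \ge (1+\epsilon)\, m_*(n),
    \end{cases}
    \qquad \text{for every fixed } \epsilon > 0,

so the error curve, as a function of the sample size m = m(n), approaches a step function rather than decaying gracefully.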
On the Information Theoretic Limits of Learning Ising Models
Tandon, Rashish, Shanmugam, Karthikeyan, Ravikumar, Pradeep K., Dimakis, Alexandros G.
We provide a general framework for computing lower bounds on the sample complexity of recovering the underlying graphs of Ising models, given i.i.d. samples. While there have been recent results for specific graph classes, these involve fairly extensive technical arguments that are specialized to each specific graph class. In contrast, we isolate two key graph-structural ingredients that can then be used to specify sample complexity lower bounds. The presence of these structural properties makes the graph class hard to learn. We derive corollaries of our main result that not only recover existing recent results, but also provide lower bounds for novel graph classes not considered previously. We also extend our framework to the random graph setting and derive corollaries for Erdős–Rényi graphs in a certain dense setting.
Information Theoretic Limits for Linear Prediction with Graph-Structured Sparsity
Barik, Adarsh, Honorio, Jean, Tawarmalani, Mohit
We analyze the number of samples necessary for sparse vector recovery in a noisy linear prediction setup. This model includes problems such as linear regression and classification. We focus on structured graph models. In particular, we prove that the number of samples sufficient for the weighted graph model proposed by Hegde and others is also necessary. We use Fano's inequality on well-constructed ensembles as our main tool for establishing information-theoretic lower bounds.
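For readers unfamiliar with the technique, a standard instantiation of the Fano argument is sketched below; the 'well-constructed ensembles' of the paper are more refined, so this is illustrative only. Let the unknown parameter \theta be uniform on a finite family \{\theta_1, \dots, \theta_M\}, and let X^n denote n i.i.d. samples from P_\theta. Fano's inequality gives, for any estimator \hat\theta,

    \Pr\big(\hat\theta \neq \theta\big)
    \;\ge\; 1 - \frac{I(\theta; X^n) + \log 2}{\log M}
    \;\ge\; 1 - \frac{n \max_{j \neq k} \mathrm{KL}\big(P_{\theta_j} \,\|\, P_{\theta_k}\big) + \log 2}{\log M},

so driving the error probability below \delta forces n \ge \big((1-\delta)\log M - \log 2\big) / \max_{j \neq k} \mathrm{KL}(P_{\theta_j} \| P_{\theta_k}). Sharp lower bounds then come from ensembles that are large (big M) yet statistically close (small pairwise KL divergence).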
Information Theoretic Limits on Learning Stochastic Differential Equations
Bento, José, Ibrahimi, Morteza, Montanari, Andrea
Consider the problem of learning the drift coefficient of a stochastic differential equation from a sample path. In this paper, we assume that the drift is parametrized by a high dimensional vector. We address the question of how long the system needs to be observed in order to learn this vector of parameters. We prove a general lower bound on this time complexity by using a characterization of mutual information as time integral of conditional variance, due to Kadota, Zakai, and Ziv. This general lower bound is applied to specific classes of linear and non-linear stochastic differential equations. In the linear case, the problem under consideration is the one of learning a matrix of interaction coefficients. We evaluate our lower bound for ensembles of sparse and dense random matrices. The resulting estimates match the qualitative behavior of upper bounds achieved by computationally efficient procedures.
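The identity underlying this lower bound, in the unit-diffusion case dX_t = b(X_t; \theta)\, dt + dW_t (our schematic rendering; the paper's setting may involve a more general diffusion coefficient), expresses the mutual information between the parameter and the observed path as a time integral of the causal filtering error:

    I\big(\theta;\, X_0^T\big) \;=\; \frac{1}{2} \int_0^T \mathbb{E}\Big[\big\| b(X_t; \theta) - \mathbb{E}\big[b(X_t; \theta) \,\big|\, X_0^t\big] \big\|^{2}\Big]\, dt.

Since the integrand is a conditional variance of the drift, bounding it at each time translates directly into a lower bound on the observation horizon T needed to accumulate the mutual information required for recovering \theta.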